Distributional semantic model


Are LLMs Models of Distributional Semantics? A Case Study on Quantifiers

Zhang, Enyan, Wang, Zewei, Lepori, Michael A., Pavlick, Ellie, Aparicio, Helena

arXiv.org Artificial Intelligence

Distributional semantics is the linguistic theory that a word's meaning can be derived from its distribution in natural language (i.e., its use). Language models are commonly viewed as an implementation of distributional semantics, as they are optimized to capture the statistical features of natural language. It is often argued that distributional semantics models should excel at capturing graded/vague meaning based on linguistic conventions, but struggle with truth-conditional reasoning and symbolic processing. We evaluate this claim with a case study on vague (e.g., "many") and exact (e.g., "more than half") quantifiers. Contrary to expectations, we find that, across a broad range of models of various types, LLMs align more closely with human judgements on exact quantifiers than on vague ones. These findings call for a re-evaluation of the assumptions underpinning what distributional semantics models are, as well as what they can capture.
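
As a purely hypothetical illustration of the kind of comparison described above, the sketch below scores exact and vague quantifier statements with an off-the-shelf causal language model and correlates the scores with invented human ratings. The model choice (gpt2), the stimuli, and the ratings are all assumptions for demonstration, not the paper's materials.

    # Hypothetical illustration: score quantifier statements with a causal LM
    # and correlate the scores with (invented) human acceptability ratings.
    import torch
    from scipy.stats import spearmanr
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    lm = AutoModelForCausalLM.from_pretrained("gpt2")

    def sentence_logprob(text: str) -> float:
        # Sum of next-token log-probabilities under the model.
        ids = tok(text, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = lm(ids).logits
        logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
        return logprobs.gather(1, ids[0, 1:, None]).sum().item()

    # Invented stimuli for a scene where 7 of 10 dots are red, paired with
    # made-up mean human ratings (1-7 scale); the real study uses controlled
    # experimental stimuli.
    stimuli = [
        ("Seven of the ten dots are red. More than half of the dots are red.", 6.8),
        ("Seven of the ten dots are red. Fewer than half of the dots are red.", 1.3),
        ("Seven of the ten dots are red. Many of the dots are red.", 5.9),
        ("Seven of the ten dots are red. Few of the dots are red.", 2.0),
    ]
    scores = [sentence_logprob(s) for s, _ in stimuli]
    ratings = [r for _, r in stimuli]
    rho, _ = spearmanr(scores, ratings)
    print(f"Spearman correlation with human ratings: {rho:.2f}")

Computing the correlation separately for the exact and the vague items would mirror the paper's contrast between the two quantifier types.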


Paraphrasing, textual entailment, and semantic similarity above word level

Kovatchev, Venelin

arXiv.org Artificial Intelligence

This dissertation explores the linguistic and computational aspects of the meaning relations that can hold between two or more complex linguistic expressions (phrases, clauses, sentences, paragraphs). In particular, it focuses on Paraphrasing, Textual Entailment, Contradiction, and Semantic Similarity. In Part I: "Similarity at the Level of Words and Phrases", I study the Distributional Hypothesis (DH) and explore several different methodologies for quantifying semantic similarity at the levels of words and short phrases. In Part II: "Paraphrase Typology and Paraphrase Identification", I focus on the meaning relation of paraphrasing and the empirical task of automated Paraphrase Identification (PI). In Part III: "Paraphrasing, Textual Entailment, and Semantic Similarity", I present a novel direction in the research on textual meaning relations, resulting from joint research carried out on paraphrasing, textual entailment, contradiction, and semantic similarity.
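
As a toy illustration of the Distributional Hypothesis studied in Part I, the following sketch builds co-occurrence vectors from a miniature corpus and compares them with cosine similarity; the corpus and window size are arbitrary choices for demonstration, not the dissertation's experimental setup.

    # Toy co-occurrence vectors: words used in similar contexts end up with
    # similar vectors, the core prediction of the Distributional Hypothesis.
    import numpy as np

    corpus = ("the cat chased the mouse . the dog chased the cat . "
              "the dog bit the mouse . the cat bit the string").split()
    vocab = sorted(set(corpus))
    idx = {w: i for i, w in enumerate(vocab)}

    window = 2  # count neighbours within two positions on either side
    M = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(corpus):
        for j in range(max(0, i - window), min(len(corpus), i + window + 1)):
            if j != i:
                M[idx[w], idx[corpus[j]]] += 1

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    # "cat" and "dog" share contexts (chased, bit, the), so they score higher
    # than a pair with different distributional profiles.
    print(cosine(M[idx["cat"]], M[idx["dog"]]))
    print(cosine(M[idx["cat"]], M[idx["chased"]]))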


Kelly

AAAI Conferences

We explore replacing the declarative memory system of the ACT-R cognitive architecture with a distributional semantics model. ACT-R is a widely used cognitive architecture, but scales poorly to big data applications and lacks a robust model for learning association strengths between stimuli. Distributional semantics models can process millions of data points to infer semantic similarities from language data or to infer product recommendations from patterns of user preferences. We demonstrate that a distributional semantics model can account for the primacy and recency effects in free recall, the fan effect in recognition, and human performance on iterated decisions with initially unknown payoffs. The model we propose provides a flexible, scalable alternative to ACT-R's declarative memory at a level of description that bridges symbolic, quantum, and neural models of cognition. Our intent is to advance toward a cognitive architecture capable of modeling human performance at all scales of learning.
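
For intuition only, here is a minimal sketch of similarity-based retrieval from a distributed memory trace. It is not the authors' model; it only illustrates how cosine similarity to a superposed vector can play the role that activation plays in ACT-R's declarative memory. The words, dimensionality, and study list are assumptions.

    # Assumed toy setup: each word gets a random high-dimensional vector;
    # a study list is stored as the superposition (sum) of its item vectors,
    # and recognition strength is a cue's cosine similarity to the trace.
    import numpy as np

    rng = np.random.default_rng(0)
    dim = 256
    words = ["bread", "butter", "hammer", "nail", "cloud"]
    vectors = {w: rng.standard_normal(dim) for w in words}

    studied = ["bread", "butter", "hammer"]
    memory = np.sum([vectors[w] for w in studied], axis=0)

    def recognition_strength(word):
        v = vectors[word]
        return float(v @ memory / (np.linalg.norm(v) * np.linalg.norm(memory)))

    for w in words:
        flag = "studied" if w in studied else "lure"
        print(f"{w:8s} {flag:8s} {recognition_strength(w):+.3f}")

Studied items score well above the lures, and adding more items to the trace dilutes each one's similarity, which gives a first intuition for fan-like effects.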


Novel Aficionados and Doppelgängers: a referential task for semantic representations of individual entities

Bruera, Andrea, Herbelot, Aurélie

arXiv.org Artificial Intelligence

In human semantic cognition, proper names (names which refer to individual entities) are harder to learn and retrieve than common nouns. This seems to be the case for machine learning algorithms too, but the linguistic and distributional reasons for this behaviour have not been investigated in depth so far. To tackle this issue, we show that the semantic distinction between proper names and common nouns is reflected in their linguistic distributions by employing an original task for distributional semantics, the Doppelgänger test, an extensive set of models, and a new dataset, the Novel Aficionados dataset. The results indicate that the distributional representations of different individual entities are less clearly distinguishable from each other than those of common nouns, an outcome which intriguingly mirrors human cognition.
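
The sketch below illustrates the kind of distinguishability measurement a Doppelgänger-style test implies, using synthetic vectors as stand-ins for representations built from text about each entity; the names, dimensionality, and noise level are assumptions, and nothing here reproduces the paper's models or dataset.

    # Simulate mention vectors for two individuals as a shared "entity core"
    # plus noise, then compare within-entity vs between-entity similarity.
    import numpy as np

    rng = np.random.default_rng(1)
    dim = 50
    noise = 0.8  # higher noise = less distinguishable entities

    def mentions(core, n=5):
        return [core + noise * rng.standard_normal(dim) for _ in range(n)]

    anna = mentions(rng.standard_normal(dim))
    maria = mentions(rng.standard_normal(dim))

    def cosine(u, v):
        return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

    within = np.mean([cosine(u, v) for i, u in enumerate(anna)
                      for v in anna[i + 1:]])
    between = np.mean([cosine(u, v) for u in anna for v in maria])
    print(f"within-entity: {within:.2f}  between-entity: {between:.2f}")

In these terms, the paper's finding is that the within/between gap is smaller for representations of individual entities than for common nouns.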


Don't Blame Distributional Semantics if it can't do Entailment

Westera, Matthijs, Boleda, Gemma

arXiv.org Artificial Intelligence

Distributional semantics has emerged as a promising model of certain 'conceptual' aspects of linguistic meaning (e.g., Landauer and Dumais 1997; Turney and Pantel 2010; Baroni and Lenci 2010; Lenci 2018) and as an indispensable component of applications in Natural Language Processing (e.g., reference resolution, machine translation, image captioning; especially since Mikolov et al. 2013). Yet its theoretical status within a general theory of meaning and of language and cognition more generally is not clear (e.g., Lenci 2008; Erk 2010; Boleda and Herbelot 2016; Lenci 2018). In particular, it is not clear whether distributional semantics can be understood as an actual model of expression meaning - what Lenci (2008) calls the 'strong' view of distributional semantics - or merely as a model of something that correlates with expression meaning in certain partial ways - the 'weak' view. In this paper we aim to resolve, in favor of the 'strong' view, the question of what exactly distributional semantics models, what its role should be in an overall theory of language and cognition, and how its contribution to state-of-the-art applications can be understood. We do so in part by clarifying its frequently discussed but still obscure relation to formal semantics. Our proposal relies crucially on the distinction between what linguistic expressions mean outside of any particular context, and what speakers mean by them in a particular context of utterance.